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Abstract 

This paper presents a new Bayesian model and algorithm used for depth and intensity profiling 
using full waveforms from the time-correlated single photon counting (TCSPC) measurement in 
the limit of very low photon counts. The model proposed represents each Lidar waveform as a 
combination of a known impulse response, weighted by the target intensity, and an unknown constant 
background, corrupted by Poisson noise. Prior knowledge about the problem is embedded in a 
hierarchical model that describes the dependence structure between the model parameters and their 
constraints. In particular, a gamma Markov random field (MRF) is used to model the joint distribution 
of the target intensity, and a second MRF is used to model the distribution of the target depth, which 
are both expected to exhibit significant spatial correlations. An adaptive Markov chain Monte Carlo 
algorithm is then proposed to compute the Bayesian estimates of interest and perform Bayesian 
inference. This algorithm is equipped with a stochastic optimization adaptation mechanism that 
automatically adjusts the parameters of the MRFs by maximum marginal likelihood estimation. 
Finally, the benefits of the proposed methodology are demonstrated through a serie of experiments 
using real data. 
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I. Introduction 

Reconstruction of 3-dimensional scenes using time-of-flight laser imaging detection and 
ranging (Lidar) systems is a challenging problem encountered in many applications, including 
automotive 0-0 , environment sciences 0, 0, architectural engineering and defence 
Q, 0. This problem consists of illuminating the scene with a train of laser pulses and 
analyzing the distribution of the photons reflected from the targets to infer the range, as 
well as intensity information about the scene. Using scanning systems, an histogram of time 
delays between the emitted pulses and the detected photon arrivals is usually recorded for 
each pixel, associated with a different region of the scene. The recorded photon histograms 
are classically decomposed into a series of peaks whose positions can be used to infer the 
distance of the object(s) present in each region of the scene and whose amplitudes provide 
information about the intensity of the objects. 

In this paper, we consider applications where the flux of detected photons is small and for 
which classical methods usually provide unsatisfactory results in terms of range and intensity 
estimation. This is typically the case when the acquisition time or the laser source power are 
small relative to the range of the target(s) 0. 


Recently, Kirmani el al. (101, investigated a new method for reconstructing 3-dimensional 
scenes under low photon flux conditions by registering the time of arrival of the first photon 
in each pixel. Based on an appropriate statistical model relating the time of arrival to the 
target distance and intensity, they proposed different processing steps, to handle ambient 
noise (background photons) and the spatial regularity of the scene to obtain estimated target 


distances and intensities. In contrast with the method proposed in [10], we consider a 
scanning system whose acquisition time per pixel is fixed, thus avoiding any potential 
synchronisation issues and leading to a deterministic and user-defined overall acquisition 
duration. Consequently, the number of detected photons can be larger than one for some 
pixels, whereas some pixels may be empty, i.e., contain no detected photons. 

In this work, we assume that the targets of interest are opaque, i.e., are composed of a single 
surface per pixel and that at least a surface is present in each scanned pixel. Generalization 


to more complex objects will be discussed in the conclusion of this paper. As in [10], we 
consider the presence of two kinds of detector events: the ’’useful” photons originating from 
the illumination laser and scattered back from the target; and those background detector 
events originating from ambient light and the ’’dark” events resulting from detector noise. 
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The proposed method aims to estimate the respective contributions of the actual target and the 
background in the photon timing histograms. More precisely, it also allows the estimation of 
the distance and intensity of the surface associated with each pixel, together with the average 
background levels, within a single estimation procedure. 


Adopting a Bayesian framework as in [111, [12], we assign prior distributions to the 
unknown model parameters to include available information (such as parameter constraints) 
within the estimation procedure. In particular, Markov random fields (MRFs) are introduced 
to model spatial correlations for the target distances and intensities. The joint posterior 
distribution of the unknown parameter vector is then derived, based on the Poisson statistical 
properties of the observed data. Since classical Bayesian estimators cannot be easily computed 
from this joint posterior, a Markov chain Monte Carlo (MCMC) method is used to generate 
samples according to this posterior. More precisely, we construct an efficient stochastic 
gradient MCMC (SGMCMC) algorithm [ |T3] ] that simultaneously estimates the background 
levels and the target distances and intensity, along with the MRFs parameters. 

The main contributions of this work are threefold: 

1) We develop new hierarchical intensity and depth models taking into account spatial 
correlations through Markovian dependencies. These flexible models are embedded 
within the observation model for full waveform Lidar-based low photon count imaging 

2) An adaptive Markov chain Monte Carlo algorithm is then proposed to compute the 
Bayesian estimates of interest and perform Bayesian inference. This algorithm is equipped 
with a stochastic optimization adaptation mechanism that automatically adjusts the 
parameters of the Markov random fields by maximum marginal likelihood estimation, 
thus removing the need to set the regularization parameters by cross-validation. 

3) We show the benefits of the proposed flexible model for reconstructing a real 3D 
object in scenarios where the number of detected photons is very low. Specifically, we 
demonstrate the ability of the proposed algorithm to handle empty pixels as well as 
background levels. 

The remainder of this paper in organized as follows. Section [TT] recalls the classical 
statistical model used for depth imaging using time-of-flight scanning sensors, based on 
TCSPC. Section |nT] presents the proposed hierarchical Bayesian model for low photon count 
depth imaging, which takes into accounts in the inherent spatial correlations between pixels. 
The estimation of the Bayesian model parameters using adaptive MCMC methods is discussed 


in Section IV Simulation results conducted using an actual time-of-flight scanning sensor 
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are presented and discussed in Section [V| Finally, conclusions and potential future work are 


reported in Section VI 


II. Problem formulation 

We consider a set of JV row x7V co i observed Lidar waveforms/pixels y ?; j = [y, i,..., ( i,j ) G 

{1,..., N row } x {1,..., Ay () |} where T is the number of temporal (corresponding to range) 
bins. To be precise, y lt]i t is the photon count within the /th bin of the pixel or location 
Let tij be the position of an object surface at a given range from the sensor. According to 
0. each photon count y%j.t is assumed to be drawn from the following Poisson distribution 

Vi,j,t ~ V (r i}j g 0 (t - t^) + b id ) (1) 

where y 0 (■) > 0 is the photon impulse response , r t j denotes the intensity of the target and 
b it j stands for the background and dark photon level, which is constant in all bins of a given 
pixel. The photon impulse response g 0 (-) is assumed to be known and estimated during the 
imaging system calibration. 

Due to physical considerations, the target intensity is assumed to satisfy the following 
positivity constraints 

r,„j > 0, Vi,j. (2) 

Similarly, the average background level in pixel satisfies b hJ > 0, Vi, j. The problem addressed 
in this paper consists of estimating the position (i.e., tij) of the targets, their intensity r tJ and 
the background levels bij from the observed data gathered in the N mw x N co \ x T array Y. 

The next section studies the proposed Bayesian model to estimate the unknown parameters 
in (|T[) while ensuring the positivity constraints mentioned above. 

III. Bayesian model 

The unknown parameter vector associated with (|T[) contains the surfaces intensity R, the 
position of the target T and the background levels B (satisfying positivity constraints), where 
[R] i ■ = r^j, T ( ^ = tij and [B] i ■ = b hJ . This section summarizes the likelihood and the 
parameter priors associated with these parameters. 
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A. Likelihood 

Assuming the Poisson noise realizations in all bins and for all wavelengths are independent, 
Eq. ([]} leads to 

T 

P{ Y|R, B T) = H f] -PPj exp _Ai ' i ’ t (3) 

i,j t =1 Vi ’iL 

where \ j>t = r^g^t - t i}j ) + b id . 

B. Prior for the target positions 

To account for the spatial correlations between the target distances in neighboring pixels, 
we propose to use a Markov random field as a prior distribution for given its neighbors 
Ty(ij), he., = f(t id |T v( ij)), where V(i, j) is the neighborhood of the pixel (ij) 

and = {te ,f} u' More precisely, this paper focuses on the following discrete MRF 


/(T|c) = exp [-cfi T)], Vtij e {1,.. ■, T}, (4) 

Lr[C) 

where c < 0 is a parameter tuning the amount of correlation between pixels, G(c) is a 
normalization (or partition) constant and where </>(■) is an arbitrary cost function modeling 
the correlation between neighbors. In this work we propose to use the following cost function 


^( T ) ^2 1^4 - 

hj ( i',j')eV(i,j ) 


(5) 


which corresponds to a total-variation (TV) regularization [14], [15] promoting piecewise 
constant depth image. Moreover, the higher the value of c, the more correlated the neighboring 
pixels. Several neighborhood structures can be employed to define V(i, j). Fig. [I] shows two 
examples of neighborhood structures. Here, the eight pixel structure (or 2-order neighborhood) 
will be considered in the rest of the paper for the depth parameters, as it provides in practice 
smoother depth images. Different cost functions (e.g., quadratic functions) could also have 
been used to model the depth correlations, depending on the application. Since the depths are 
assumed to take a finite number of values, it would not significantly change the estimation 


procedure presented in Section IV 


C. Prior for the target intensity 

In the absence of background, i.e., if b h j = 0, gamma distributions are conjugate priors 
for each intensity parameter r^j. Consequently, it seems natural to consider gamma priors 
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for the intensity. As will be shown later, gamma priors are still conjugate distributions when 
bij > 0. We propose to assign r l3 the following gamma prior 


A, 


Q ( «o, — 
a o 


( 6 ) 


where a l3 > 0 is a local parameter related to the prior mean and mode of r l3 and q 0 > 0 
is a global parameter which controls the shape of the distribution tails and thus the prior 
deviation of r i3 from a l3 . A key feature of hierarchical models is their capacity to encode 
dependence and act as pooling mechanisms that share information across covariates to im¬ 
prove the inference. Here we specify <( 6 ]) to reflect the prior belief that intensity exhibit spatial 
correlations. In particular, due to the spatial organization of images, we expect the values of 
r l3 to vary smoothly from one pixel to another. In order to model this behaviour, we specify 
a %3 such that the resulting prior for R is a hidden gamma-MRF (GMRF) fT 6 | . 

More precisely, we introduce an (7V row +1) x (7V col + 1) auxiliary matrix F with elements 
7 ij G M + and define a bipartite conditional independence graph between R and F such 
that each r l3 is connected to four neighbor elements of F and vice-versa. This 1st order 
neighbourhood structure is depicted in Fig. [2j where we notice that any given r l3 and r i+lj 


are 2nd order neighbors via "fi+i.j and 7 ,+ 1 J+ 1 . We specify a GMRF prior for R, F [16|, and 
obtain the following joint prior for R, T 


/(R,r|a„) = 


n 




G(ao) 11 lJ 

y ’ (ij)eVa 


X 


X 


n (7 i',j') 

(i'j')eVr 


—(“o+l) 


n ex p 


^0 ^i,j 

47 ^ f 


(7) 


where V R = {1,..., N mw } x {1,..., lV col }, V r = {1,..., N mw + 1} x {1,..., A^ col + 1}, and 
the edge set E consists of pairs ((i, j), (■ i',j ')) representing the connection between r l3 and 
7 i'ji. It can be seen from ([7]) that 


i.j | r, cxq 

h,] | R'? Ho 


^i,j (r'J 


Q ( 

V a o 

ig (ao,a 0 /9ij(R)) 


( 8 a) 

( 8 b) 


where 


a 


— 4 ( 7 jj + 7 + 7i,/-i + Ti-iy-i) 
A,j(R) = i r i,j + A+ 1 J + r iJ+ 1 + r i+ i ij+ i) /4. 


-1 
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Notice that we denote explicitly the dependence on the value of a 0 , which here acts 
a regularization parameter that controls the amount of spatial smoothness enforced by the 
GMRF. Following an empirical Bayesian approach, the value of oq remains unspecified and 
will be adjusted automatically (together with the depth parameter c) during the inference 
procedure by maximum marginal likelihood estimation. 

Finally, it is worth mentioning that this intensity model has similarities with the depth 
model ((4]) in the sense that spatial correlation is used to regularize the parameter estimation 
problem. However, the depth model (|4]) follows a segmentation approach in which the target 
depths are assumed (and constrained) to take values in a finite set. This leads to a piecewise 
constant depth representation which is usually sufficient for most applications, due to the 
depth resolution of the recent Lidar-based imaging systems. 

The intensity model proposed in this paper provides a spatially smooth representation of 
the intensities that is possibly more realistic than a piece-wise constant representation (that 
would result from a TV-based intensity regularization), as it does not enforce the intensities 
to take a finite number of possible values. 


D. Prior for the background levels 

In a similar fashion to the intensities, when j = 0, gamma distributions are conjugate 
priors for and the following Gamma priors 

bi,j ~ G (v, v) (9) 


are assigned to bij, where r) > 0 and v > 0 arc fixed hyperparameters. In order to reflect 
the lack of prior knowledge about the background levels, ( //, u) is arbitrarily set to obtain a 
weakly informative prior (( 77 , u) = (1,10) in all results presented in this paper). However, 
( 77 , u) can be easily adapted if additional information, e.g., observation conditions, is available. 


It could also be estimated as in [17], however the choice ( 77 , zv) = (1,10) made here has a 
limited impact on the estimation performance. 

Assuming prior independence between the average background levels of the different 
pixels, we obtain 


/(B 1 77, u) = Y[f(bi,j\ri,v). 


GO) 
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E. Joint posterior distribution 

We can now specify the joint posterior distribution for T. B, R and T given the observed 
waveforms Y and the value of the spatial regularization parameters c and a 0 (recall that their 
value will be determined by maximum marginal likelihood estimation during the inference 
procedure). Using Bayes’ theorem, and assuming prior independence between T, (T. R) and 
B, the joint posterior distribution associated with the proposed Bayesian model is given by 

/( T. B, R,r|Y, « 3 , V: v) cx /( Y|R, B. T)f(B\ V , v)f(T\c)f(R, r|a 0 ). (H) 

For illustration, Fig. [3] depicts the directed acyclic graph (DAG) summarising the structure 
proposed Bayesian model (recall that R. F have a bi-partite neighbourhood structure, which 
is illustrated in the graphical model of Fig. [2]). 

IV. Bayesian Inference 

A. Bayesian estimators 

The Bayesian model defined in Section [III] specifies the joint posterior density for the 
unknown parameters T, B, R and F given the observed data Y and the parameters c and o 0 . 
This posterior distribution models our complete knowledge about the unknowns given the 
observed data and the prior information available. In this section we define suitable Bayesian 
estimators to summarize this knowledge and perform depth imaging. More precisely, we 
propose to use the following two Bayesian estimators: 1) the minimum mean square error 
estimator (MMSE) of the intensity matrix 

Rmmse = E[R|Y,c,« 0 ], (12 ) 

where the expectation is taken with respect to the marginal posterior density /(R| Y, c, a 0 ) (by 
marginalizing T. B and F this density takes into account their uncertainty); 2) the maximum 
a posteriori (MAP) estimator of target positions 

T M ap = argmax/(T|Y, c, d 0 ), (13) 

T 

which is particularly adapted to estimate discrete parameters. Notice that in ( [I2| ) and ( fT3| ), 
we have set c = c and o 0 = d 0 , which denotes the maximum marginal likelihood estimator 
of the MRF regularisation parameters c and a (l given the observed data Y, i.e., 

(c,d 0 )= argmax /(Y|c,a 0 ), (14) 

ceK+,a 0 eK+ 
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This approach for specifying (c, a 0 ) ' s taken from the empirical Bayes framework in which 
parameters with unknown values are replaced by point estimates computed from observed 
data (as opposed to being fixed a priori or integrated out of the model by marginalization). 
As explained in [13], this strategy has several important advantages for MRF parameters 
with intractable conditional distributions such as (c,a 0 ). In particular, it allows for the 
automatic adjustment of the value of (c, «o) for each image (thus producing significantly 
better estimation results than using a single fixed value of (c, a 0 ) for all data sets), and has 
a computational cost that is several times lower than that of competing approaches, such 
as including (c, a 0 ) in the model and subsequently marginalising them during the inference 
procedure [18]. 


B. Bayesian algorithm 


Computing the estimators ( [T2] ) and ( [13] ) is challenging because it involves calculating 
expectations with respect to posterior marginal densities, which in turn require evaluating the 
full posterior ( [TT| ) and integrating it over a very high-dimensional space. Computing (c, do) 
is also difficult because it involves solving an intractable optimisation problem, (because it is 
not possible to evaluate the marginal likelihood /(Y|c, Oo) or its gradient V/(Y|c, o 0 )). Here 
we adopt the approach proposed in [ 13] and design a stochastic optimisation and simulation 
algorithm to compute (jT2[) and ( [13] ) simultaneously. That is, we construct a stochastic gradient 
Markov chain Monte Carlo (SGMCMC) algorithm that simultaneously estimates (c, do) 
and generates a chain of A r MC samples {R/"d B*™). T^J-^f asymptotically distributed 

according to the marginal density /(T. B, R, TjY, c, do) (this algorithm is summarised in 
Algo. [T] below). Once the samples have been generated, the estimators ( [12] ) and ( [13] ) are 


approximated by Monte Carlo integration [T9] Chap. 10], i.e., 

-i Nyic 

E R(n) > 


R 


■MMSEj 


Ymc — A) 


(15) 


bi 


n=N hi +l 


and 


(Aj) 


N M c 


MAPj 


argmax & [t 

kel,...,T ltr I 1 ' 

’ ’ n=A r bi+l 


X n ) 

'i,j 


(16) 


where the samples from the first N bi iterations (corresponding to the transient regime or 
burn-in period) are discarded and where <:)(•) denotes the Kronecker delta function. The main 
steps of this algorithm are detailed in below. 
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1) Sampling the target positions: It can be seen from ( pT| ) that 
f(Uj = t\Y. T Vd , B, R, T, c, a 0 , V, v) 

oc = t, r itj , bij)f{tij = f|T V (jj)). (17) 


Consequently, sampling the target positions can be achieved by sampling sequentially each 
position from its conditional distribution, i.e., by drawing randomly from {1,...,T} with 
known probabilities. In our experiments we used a Gibbs sampler implemented using a 
colouring scheme such that many positions can be updated in parallel (9 steps required when 
considering a 2-order neighborhood structure). 

2) Sampling the intensity coefficients: Similarly, from (jTTJ) we obtain 
/(R|Y. T. B, r. c, a 0 ,r/, u) 


= II /(Gj|y;j, Uj, bij, r, a 0 ) ( 18 ) 

i.e., the elements of R are a posteriori independent (conditioned on the other parameters) 
and can thus be updated simultaneously. Moreover, 


f (gj |y*j, t%,j ) bij i r, oq) 


oc 1 exp 


*0 r i,j 

hJCt) 


HAS 4 exp 


t =i 


(19) 


with A i dtt = r id g 0 (t — t id ) + b ld . By noticing that Ylt=i KpT I s a polynomial function of r h] , 
whose degree is Y^t=i Vi,j,t (since g 0 (t — t id ) > 0, Vf) , it turns out that ( |~i~9l ) can be expressed 
as a finite mixture of gamma distributions whose weights and parameters are known (see 
Appendix for the derivation of (fT9])). Sampling r id from its conditional distribution finally 
reduces to randomly selecting one of components of the mixture and then sampling from the 
corresponding gamma distribution. Note that the auxiliary variables in F do not appear in 
the likelihood Q and that sampling F from its conditional distribution reduces to sampling 
from ( [8b] ). 

3) Sampling the background: Similarly analysis applies when sampling the background 
levels. Precisely, 

/(B|Y. T. R, r. c, a 0: r /, u) 


= II }{b%j |yij> Kv Gj, V, v) (20) 

i.e., the elements of B are a posteriori independent (conditioned on the other parameters) 
and can thus be updated simultaneously. Moreover, 
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f( b ij\yij^ij,rij,r],v) 


oc tf-j 1 exp' 




( 21 ) 


t =i 


which can also be expressed as a finite mixture of gamma distributions. 

4) Updating the MRF regularization parameters c and a 0 -' If marginal likelihood /(Y|c, o- 0 ) 
was tractable we could update (c, a 0 ) from one MCMC iteration to the next by using a classic 
gradient descent step 


a , 


(n+l) 


= a 


(n) 


+ £ri 


_d_ 

<9ct c 

d 


log/(Y|c^,4 n) ), 


c (n+l) = c (-) + ^ l 0 g /CY | c (-) a (-) ) 


( 22 ) 


with = n -3 / 4 , such that cp," ! (resp. c^) converges to d 0 (resp. c) as n -A oo. However, this 
gradient has two levels of intractability, one due to the marginalization of (T. B, R. T) and 
another one due to the intractable normalizing constant of the gamma MRF. We address this 
difficulty by following the approach proposed in [ 13]; that is, by replacing V log / (Y| c^a^) 
with estimators computed with the samples generated by the MCMC algorithm at iteration n, 
and two sets of auxiliary variables. More precisely, we generate (R 7 , T 7 ) ~ /C 1 (R.r|RW.r (n) ,a[ ) ri “ 1) ) 
with an MCMC kernel /Ci with target density ([7]) (in our experiments we used a Gibbs sampler 
implemented using a colouring scheme such that all the elements of R 7 and F 7 are generated 
in parallel). We also generate T 7 ~ /C 2 (T|T^, c^ n ~ 1} ) with an MCMC kernel /C 2 with target 
density Q. The updated value oj" 1 (resp. c l ' n} ) is then projected onto an interval [0, A n ] (resp. 

[0, C„], see (10:) and (11:) in Algo |TJ) to guarantee the positivity constraints c, a 0 G M + and 
the stability of the stochastic optimization algorithm (we have used A t = C n = 20). 

It is worth mentioning that if it was possible to simulate the auxiliary variables (R 7 ,r 7 ) 

(resp. T 7 ) exactly from (|7|) (resp. @), then the estimator of V log f(Y\c (n \ a^ 1 ' 1 ) used in 
Algo. 0 would be unbiased and as a result (c( n \ CKq)) would converge exactly to (c, d 0 ). 
However, exact simulation from ([7]) and Q is not computationally feasible and therefore we 
resort to the MCMC kernels /Ci and /C 2 and obtain a biased estimator of V log j'(Y\d n) . a^) 
that drives to a neighbourhood of (c, do) [13]. We have found that computing this 

biased estimator is significantly less expensive than alternative approaches, (e.g., using an 


approximate Bayesian computation algorithm as in ]18|), and that it leads to very accurate 
depth and intensity results. 
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V. Simulation results 

A. Data acquisition 

We propose comparing the performance of the proposed method to reconstruct a depth 
image of a life-sized polystyrene head located at a distance of 40m from a time-of-flight 
scanning sensor, based on TCSPC. The transceiver system and data acquisition hardware 
used for this work is broadly similar to that described in |9|. [ |20] [221, which was previously 
developed at Heriot-Watt University. For the measurements reported in this section, the optical 
path of the transceiver was configured to operate with a fibre-coupled illumination wavelength 
of 841 nm, and a silicon single-photon avalanche diode (SPAD) detector. The overall system 
had a jitter of 95ps full width at half-maximum (FWHM). The measurements have been 
performed outdoors, on the Edinburgh Campus of Heriot-Watt University, in November 2014 
under dry clear skies and with atmospheric conditions remaining relatively constant for the 
duration of the measurement. The key measurement parameters are summarized in Table |T] 
The acquisition time per pixel in Table [T] is 30ms. However, the data format of timed events 
allows the construction of photon timing histograms associated with shorter acquisition times, 
after measurement, as the system records the time of arrival of each detected photon. Here, 
we evaluate our algorithms for acquisition times of 30ms, 6ms, 3ms, 600/zs, 300/zs and 60/us 
per pixel. 

The instrumental impulse response go (■) is estimated from preliminary experiments by 
analysing the distribution of photons reflected onto a Spectralon panel (a commercially 
available Lambertian scatterer), placed at 40m from laser source/detector. A long acquisition 
time (60s) is considered here to reduce the impact of the photon count variability and a 
pre-processing step is used to remove the constant background in the measured response. 
The resulting instrumental impulse response is depicted in Fig. [5j 

Table [D] provides details regarding the amount of detected photons when varying the 
acquisition time. The second row of |TT] shows the average number of detected photons 
per pixel, ranging from about 0.85 photons for a 60/rs acquisition time to about 420 for a 
30ms acquisition time. As expected, the number of detected photons increases linearly with 
exposure. The bottom row shows that almost 49% of the pixels do not contain any detected 
photons for a 60/vs acquisition time and that this proportion decreases when increasing the 
acquisition time. 
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B. Estimation performance 


The proposed method is compared to the method classically used for depth imaging |9j 
and which is divided into two steps. The first step consists of estimating using cross¬ 
correlation between g 0 (-) and the photon histogram y h j. The object depth is estimated using 

T 

ii,j, con- = argmax ^ ih.j.tM ( t-T ). (23) 


Once the estimated time target distance h.j.am has been computed, the target intensity is 
estimated using maximum l ik elihood (ML) estimation (assuming that b n = 0) as 


r i,j,ML ~ 


2-it =i 


Y2t=l9o{(t Lj.corr) 


(24) 


When the background level is relatively low compared to the maximum value of r^go ((t — t^f), 
the ML intensity estimates (conditioned on the previously estimated depths) provide satisfac¬ 
tory results and are thus consider as the comparative method in the remainder of this paper. 
The proposed algorithm has been applied with N M c = 1000 iterations, including JV bi = 200 
burn-in iterations. 

Fig. [6] compares the estimated depth maps obtained by the standard and the proposed 
methods. These results show that for large acquisition times, the two methods provide similar 
results. However, when the acquisition time decreases, the cross-correlation method starts to 
fail in identifying the target positions, especially in pixels where no photon is detected in 
a pixel, indicating that the proposed method seems more robust to the absence of signal in 
some pixels. 

Fig. [7] compares the estimated intensity maps obtained by the two methods. These results 
show that the two methods provide similar results for the longest acquisition times and that 
the proposed method is more robust to the lack/absence of detected photons. In particular, 
for the 60/iS acquisition time, few photons are detected in the pixels around the head and 
the proposed algorithm provides a smoother intensity image due to consideration of spatial 
correlations, in contrast to the standard method which process the pixels independently. Note 
that for each experiment, the Spectralon response g 0 (-) is scaled to account for the acquisition 
time (e.g., amplitude divided by ten between the 30ms and 3ms experiments). Note also that 
for some pixels, the intensities estimated by the two methods can exceed 1. This point will be 
discussed further in the conclusions. Fig. [8] compares the depth/intensity estimation results 
obtained by the two methods for an acquisition time of 300//S. This figure illustrates the 
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ability of the proposed model to handle low photon returns using the spatial correlation of 
the depths and intensities. 

In addition to the depth and intensity maps, the proposed method also estimates the average 
background level in each pixel, depicted in Fig. [9j This figure shows that for the longer 
acquisition times, higher backgrounds are estimated at the boundary between the head and 
the backplane, which can be explained by the presence of two peaks in the histograms of 
detected photons. Due to the laser beam size, some photons are reflected onto the head 
whereas others are reflected onto the backplane and thus arrive later onto the detector. When 
the number of detected photons decreases, the amplitudes of the two peaks decrease, which 
makes the detection of multiple peaks more difficult. 

The performance of the methods are quantitatively evaluated using the distance and inten¬ 
sity mean squared errors (MSEs) defined by 


where 


MSE{d u ) = 


d>i,i d'i- 7 


denotes the £ 2 - nom T di / is the estimated value of di 


MSE{r it j ) = || fi,j ~ n. 


3 II2 


(25) 

3 x 10 8 tjj/2 and 

(26) 


where fij is the estimated value of r hJ . Note that {d-,^} and {ryare unknown for the data 
set considered. Consequently we replace these values by those estimated by the standard 
method for the longest acquisition time (30ms). Figs. [TO] and |TT] depict the cumulative density 
functions (cdfs) of the distance and intensity MSEs, defined by 


FJr) = 


FJt) = 


1 


ddrov/dd C o\ 


N N , 

1 v row i v col 


£ l«w (MSE(d tJ )) 


1,3 


Dm (MSE(nj)) 


(27) 

(28) 


hj 


where 1 


(0,r) 


denotes the indicator function defined on (0, r). Figs. 10 and 11 show that the 


proposed method is more robust than the standard method when reducing the acquisition time 
and provide more consistent results in terms of depth and intensity estimation. These figures 
also highlight the ability of the proposed method to process pixels for which no photons are 
detected, as the cdfs are upperbounded by the proportion of pixels that can be processed by 
each method. 


Finally, Table III compares the computational costs of the two methods to process the 
whole image (142 x 142 pixels, T = 586), for the different acquisition durations. Due to 
the use of the MCMC method, the proposed method is significantly more computationally 
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demanding than the standard method. However, this cost must be balanced by the performance 
improvement in terms of depth and intensity estimation, when the number of detected photons 
is low. When the flux of detected photons is large enough, the consideration of spatial 
correlations has a limited impact on the estimation performance, as long as the background 
levels are low compared to the amplitudes of the peaks associated with actual targets. It 
is interesting to note the computational cost of the Bayesian increases with the acquisition 
time, in contrast to the standard method. This is mainly due to the MCMC steps used to 
updates the intensity coefficients and the background levels which require the computation of 
polynomial coefficients whose number depends on the number of detected photons in each 
pixels (see Appendix). 


VI. Conclusion 


In this paper, we proposed a new Bayesian model for Lidar-based low photon count imaging 
of single-layered targets. In the Bayesian framework, prior distributions were assigned to 
the unknown target depths and intensity to account for the intrinsic correlations between 
neighboring pixels. An adaptive Markov chain Monte Carlo method was then developed 
to estimate the model unknown parameters, including the spatial regularization parameters, 
thus relieving practitioners from setting these parameters by cross-validation. The model 
and method were validated using real Lidar data and the results showed the benefits of the 
proposed approach compared to the classical method used when the number of detected 
photons is low. 

In the paper, we assumed that the beam associated with a given pixel is incident on a 
single surface. This assumption is reasonable for small beam sizes, compared to the target 
distance and when the scene is composed of locally continuous surfaces. When the beam 
encounters multiple surfaces, one peak will be considered as principal surface, depending on 
its amplitude and on the Lidar returns in the neighboring pixels. The remaining peaks will 
be considered as part of the background noise. Considering returns from multiple surfaces is 
an interesting problem already addressed in [5j, [11], [12| for applications where the number 
of detected photons are significantly higher. It would be interesting to extend this work for 
the low-photon imaging problem. 

Since the model considered in this paper assumed the presence of a target in each pixel, 
the proposed method will tend to process empty pixels (i.e., containing no photon) using the 
neighboring pixels, which might be inaccurate for non-locally continuous surfaces (such 
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as wire fence). Accounting for the absence of target in some pixels is currently under 
investigation. 

In the results presented in Section [V| some of the estimated intensity coefficients were 
significantly greater than one, even for long acquisition time and even when scaling the instru¬ 
mental response. Thus, the estimated intensities cannot be directly related to target reflectivity 
values as these estimation errors do not seem to be only due to estimation errors when 
extracting the instrumental impulse response. Thus constraining the intensity coefficients to 
be less than one might not be sufficient to provide accurate reflectivity estimates. As studied 


in [231-|25|, atmospheric perturbations can have a significant impact on the distribution of 
the detected photons, specially for long-range targets. Although such scintillation effects will 
have a small impact on the depth estimation, accounting for them will be necessary in future 
work to improve the reflectivity estimation. 


Appendix: On the conditional distribution of the intensity coefficients 

The conditional distribution /(Aj|yij, i %/] . b t , r T, o 0 ) can be expressed (up to a multiplica¬ 
tive constant) as 

f (t ij | y i,ji ti,j j hi,j j r, Q.q ) 


■ x O r i,j 


oc r. 


ij ' exp exp 


(29) 


t =i 


where Y2t= i ~ u i,j + r i,j v i,j’ 


u i,j — 


Vi,j = 


t .=i 
T 

'y ^ 9o{t—ti,j)> 


t =i 


and 


p^) =n a sj‘= n (*•««>(« - hi) + K) y,M pa) 

t= i t =i 

is a polynomial function of r M -. Since g 0 (t — t lt] ) > 0,Vf, —bij/go{t — t it j) is a root of 
if y l j, t > 0. Moreover, this root is of multiplicity and the polynomial order is 
thus Oij y i yi,j,t• Let 

P(r>,i) = £ wtj, (31) 


k =0 
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be the polynomial expansion of P(r it j), whose coefficients {e k } can be obtained from the 
polynomial roots. From Eq. ( [29] ), we obtain 

/ ( r i,j Iy i,ji tij > J > r, a0) 


o, 


E an+k-l ri JUj.-(r) T "‘J 

exp V ^ 

fc =0 




hj 


which can be expressed as the following mixture of () l ,j + 1 gamma distributions 

Oi,i / / 

tij, bij, r, a 0 ) = ^ w k g nj ; a 0 + k, (- j— + v t . 

k =o V 

with 


r (a 0 + k) 

Wkcce k - - 


0=0 


ai,j(r) + 


where T(-) denotes the Gamma function and J2k=o w k — 1- 


(32) 


(33) 


(34) 
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Algorithm 1 


Proposed MCMC algorithm 


1: Fixed input parameters: Lidar impulse response go (■), number of burn-in iterations TVy, total number of 
iterations Nmc 
2: Initialization (n = 0) 



3: Iterations (1 < n < Nmc) 

4: Sample from ( fl7] > 

5: Sample R ( ") from ( fl8| ) 

6: Sample B (n ) from ( [20] ) 

7: Sample r (n) from <[8b) 

8: if n < TVbi then 

9: Sample (R', r') ~ £i(R, r|R(">, r (n) , 

10: Sample T' ~ /C 2 (T|T( n \ c^" 1 )) 



12 : Set C W = V [0 ,c„] (c (n_1) + [0(T') - 0(T<">)]) 


13: else 

14: Set = c (n_1) 



16: end if 

17: Set n = n + 1. 

18: Output {RWjW}^. 
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Fig. 1. 4-pixel (left) and 8-pixel (right) neighborhood structures. The pixel considered appears as a black circle whereas 
its neighbors are depicted in white. 
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Fig. 2. Proposed 1st order GMRF neighborhood structure V(i, j) G Vr. We set r, j = 0.1, V(j, j) ^ Vr 



Fig. 3. Graphical model for the proposed hierarchical Bayesian model (fixed quantities appear in boxes). 
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Fig. 4. Photograph showing the polystyrene head used for the experiments described here and calibration targets, including 
the Spectralon panel (top right corner). 
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Fig. 5. Intrumental response obtained using spectralon panel placed at 40m from laser source/detector and for an acquisition 
time of 60s (jitter rts 95ps FWHM). 
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Fig. 6. Depth maps for the 40m target and for different per-pixel acquisition times, estimated hy the proposed Bayesian 
algorithm (top rows) and the standard method (bottom rows). Distances shown are in centimeters and the reference distance 
is the distance of the backplane. Black pixels correspond to pixels where no photon are detected and for which the cross¬ 
correlation method cannot identify the target distance. 
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Fig. 7. Intensity maps for the 40m target and for different per-pixel acquisition times, estimated by the proposed Bayesian 
algorithm (top rows) and the standard method (bottom rows). 
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X 


Fig. 8. Depth/intensity reconstruction of the target, estimated by the Bayesian (left) and standard (right) methods for a 
300)rs acquisition time. The colours represent the target intensity (dark blue for low intensity coefficients). 
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Fig. 9. Background level maps for different acquisition time, estimated by the proposed Bayesian algorithm. 
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Fig. 10. Distance RMSE cdfs provided by the standard (blue) and the proposed (red) methods for the target located at 
40m. 
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Fig. 11. Intensity RMSE cdfs provided by the standard (blue) and the proposed (red) methods for the target located at 
40m. 
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Target Stand-off Distance 

« 40m 

Target Scene 

Polystyrene head 

(« 170 x 285 x 250mm 

in W x H x D when 

viewed from the front) 

mounted on a breadboard. 

Backplane: MDF board. 

(see Fig 4i 

Laser system 

Supercontinuum 

laser source and 

tunable filter 

(NKT Photonics) 

fibre-coupled to the 

custom-designed transceiver unit 

Ilium. Wavelength 

841nm 

Laser Repetition Rate 

19.5MHz 

Ilium. Power at target 

« 240/iW average optical power 

Ilium. Beam Diameter at Target 

~ 1mm 

Acquisition Mode 

142 x 142 pixels scan 

centred on the head, 

covering an area of 

285 x 285mm at the scene 

Per-pixel acquisition time: 30 ms 

Total scan time: ss 10 minutes 

Histogram bin width 

16ps 

Histogram length 

586 bins (after gating) 

Temporal Response of System 

w 95ps FWHM 


TABLE I 

Measurement key parameters 
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60/tis 

300/.ts 

600/is 

3ms 

6ms 

30ms 

Av. Photon counts 

0.8 

4.2 

8.4 

42.0 

83.7 

418.6 

Empty pixels (%) 

48.7 

7.5 

1.7 

> 0.1 

0 

0 


TABLE II 

Average number of detected photons per pixel and proportion of empty pixels as a function of the 

ACQUISITION TIME. 



60/xs 

300ps 

600/is 

3ms 

6ms 

30ms 

X-corr+MLE 

1 

1 

1 

1 

1 

1 

Prop, method 

113 

123 

131 

152 

197 

347 


TABLE III 

Processing time (in minutes). 
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